Using a Dynamic Schedule to Increase the Performance of Tiling in Stencil Computations

Author

  • Michael Freitag
Abstract

A stencil computation determines the values of points in a grid of some dimensionality by repeatedly evaluating a given function of a grid point and its neighbors. The parallelization and optimization of stencil computations are the subject of ongoing research. The most prevalent approach is the subdivision of the iteration domain into smaller pieces, called tiles. We give an overview of a method to further increase the performance of one such tiling algorithm by employing a dynamic schedule for tile processing, improving both load balance and cache efficiency. A set of one-dimensional stencil benchmarks exhibits a performance increase of up to 20% in comparison to the Pochoir stencil compiler.
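To make the terms in the abstract concrete, the following is a minimal Python sketch of a one-dimensional stencil whose time steps are split into tiles that idle workers pick up as they become free. It is an illustration under stated assumptions, not the method of the paper: the three-point averaging function `heat_point`, the tile size, and the use of a thread pool are all hypothetical choices, and the tiling here is purely spatial, whereas the abstract does not specify the tiling algorithm it builds on.

```python
from concurrent.futures import ThreadPoolExecutor


def heat_point(left, center, right):
    """Hypothetical stencil function: weighted 3-point average (an assumption)."""
    return 0.25 * left + 0.5 * center + 0.25 * right


def step_reference(src):
    """One untiled time step; the two boundary points stay fixed."""
    dst = list(src)
    for i in range(1, len(src) - 1):
        dst[i] = heat_point(src[i - 1], src[i], src[i + 1])
    return dst


def update_tile(src, dst, lo, hi):
    """Update grid points lo..hi-1 of dst from src (one tile)."""
    for i in range(lo, hi):
        dst[i] = heat_point(src[i - 1], src[i], src[i + 1])


def step_tiled(src, tile_size, pool):
    """One time step split into tiles; each idle worker takes the next tile."""
    dst = list(src)
    n = len(src)
    futures = [
        pool.submit(update_tile, src, dst, lo, min(lo + tile_size, n - 1))
        for lo in range(1, n - 1, tile_size)
    ]
    for future in futures:
        future.result()  # wait for every tile and re-raise any worker error
    return dst


def run(grid, steps, tile_size=4, workers=3):
    """Apply the stencil for a number of time steps using tiled updates."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(steps):
            grid = step_tiled(grid, tile_size, pool)
    return grid
```

Each tile reads only from `src` and writes a disjoint slice of `dst`, so the order in which tiles are processed does not affect the result. In CPython the thread pool mainly illustrates the scheduling pattern; it is not expected to show the speedups reported in the abstract.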

Similar resources

Diamond Tiling: A Tiling Framework for Time-iterated Scientific Applications (University of Delaware, Department of Electrical and Computer Engineering, Computer Architecture and Parallel Systems Laboratory)

This paper fully develops Diamond Tiling, a technique to partition the computations of stencil applications such as FDTD. The Diamond Tiling technique is the result of optimizing the amount of useful computations that can be executed when a region of memory is loaded to the local memory of a multiprocessor chip. Diamond Tiling contributes to the state of the art on time tiling techniques in tha...

An Auto-tuning JIT Compiler for Accelerating Multiple Stencil Computations

We present a JIT compiler with auto-tuning capabilities fusing multiple stencil computations. Data arrays for scientific computing or image processing often exceed cache-memory size. To take advantage of spatial and temporal locality, a common method is to partition the images into tiling blocks for multicore architectures. In realistic scenarios, the multiple image algorithms, most of which ar...

Writing productive stencil codes with overlapped tiling

Stencil computations constitute the kernel of many scientific applications. Tiling is often used to improve the performance of stencil codes for data locality and parallelism. However, tiled stencil codes typically require shadow regions, whose management becomes a burden to programmers. In fact, it is often the case that the code required to manage these regions, and in particular their ...

Compilers for Regular and Irregular Stencils: Some Shared Problems and Solutions

Solving partial differential equations results in a continuum of regular and irregular stencil computation implementations. In this paper, we use heat diffusion on a bar to show how regular and irregular stencil computations are related, and then illustrate five complicating issues that occur in implementing the continuum of regular and irregular stencil computations in full applications. These...

Improving the arithmetic intensity of multigrid with the help of polynomial smoothers

SUMMARY The basic building blocks of a classic multigrid algorithm, which are essentially stencil computations, all have a low ratio of executed floating point operations per byte fetched from memory. This important ratio can be identified as the arithmetic intensity. Applications with a low arithmetic intensity are typically bounded by memory traffic and achieve only a small percentage of the ...

Publication date: 2014